Bootstrapping Ontology Evolution with Multimedia Information Extraction

Authors

  • Georgios Paliouras
  • Constantine D. Spyropoulos
  • George Tsatsaronis
Abstract

This chapter summarises the approach and main achievements of the research project BOEMIE (Bootstrapping Ontology Evolution with Multimedia Information Extraction). BOEMIE introduced a new approach towards the automation of knowledge acquisition from multimedia content. In particular, it developed and demonstrated the notion of evolving multimedia ontologies, which are used for the extraction, fusion and interpretation of information from content of various media types (audio, video, images and text). BOEMIE adopted a synergistic approach that combines multimedia extraction and ontology evolution in a bootstrapping process. This process involves, on the one hand, the continuous extraction of semantic information from multimedia content in order to populate and enrich the ontologies and, on the other hand, the deployment of these ontologies to enhance the robustness of the extraction system. Thus, in addition to annotating multimedia content with semantics, the extracted knowledge is used to expand our understanding of the domain and extract even more useful knowledge. The methods and technologies developed in BOEMIE were tested in the domain of athletics, using large sets of annotated content and evaluation by domain experts. The evaluation has proved the value of the technology, which is applicable in a wide spectrum of domains that are based on multimedia content.

1 Motivation and Objectives of the BOEMIE Project

BOEMIE aimed to automate the process of knowledge acquisition from multimedia content, which is growing at an increasing rate in both public and proprietary webs. To this end, it introduced the concept of evolving multimedia ontologies. The project was unique in that it linked multimedia extraction with ontology evolution, creating a synergy of great potential.

In recent years, significant advances have been made in the area of automated extraction of low-level features from audio and visual content.
However, little progress has been achieved in the identification of high-level semantic features or the effective combination of semantic features derived from various modalities. Driven by domain-specific multimedia ontologies, BOEMIE information extraction systems are able to identify high-level semantic features in image, video, audio and text and fuse these features for improved extraction. The ontologies are continuously populated and enriched using the extracted semantic content. This is a bootstrapping process, since the enriched ontologies in turn drive the multimedia information extraction system.

Project website: http://www.boemie.org/
G. Paliouras et al. (Eds.): Multimedia Information Extraction, LNAI 6050, pp. 1–17, 2011. © Springer-Verlag Berlin Heidelberg 2011

Figure 1 provides a graphical illustration of this iterative bootstrapping process, which is implemented in the BOEMIE prototype. The main proposal of the project is illustrated by the continuous iteration that resides at the heart of the process. Information extraction is driven by semantic knowledge, while at the same time feeding the evolution of the ontologies.

Fig. 1. The BOEMIE bootstrapping process

Through the proposed synergistic approach, BOEMIE aimed at large-scale and precise knowledge acquisition from multimedia content. More specifically, the objectives of the project were:

Unifying representation for domain and multimedia knowledge.
This multimedia semantic model follows modular knowledge engineering principles and captures the different types of knowledge involved in knowledge acquisition from multimedia. It realises the linking of domain-specific ontologies, which model salient subject matter entities, and multimedia ontologies, which capture structural and low-level content descriptions.

Methodology and toolkit for ontology evolution.
The proposed methodology coordinates the various tools that use the extracted data to populate and enrich the ontologies. The toolkit provides tools to support ontology learning, ontology merging and alignment, semantic inference and ontology management.

Methodology and toolkit for information extraction.
The methodology specifies how information from the multimedia semantic model can be used to achieve extraction from various media. Additionally, it fuses information extracted from multiple media to improve the extraction performance. The toolkit comprises tools to support extraction from image, audio, video and text, as well as information fusion.

The resulting technology has a wide range of applications in commerce, tourism, e-science, etc. One of the goals of the project was to evaluate the technology through the development of an automated content collection and annotation service for athletics events in a number of major European cities. The extracted semantic information enriches a digital map, which provides an innovative and friendly way for the end user to access the multimedia content. Figure 2 illustrates this interaction of the end user with the system, which is provided by a specialised Web application called the BOEMIE semantic browser. Points of interest that are associated with interesting multimedia content are highlighted on the map. The geo-referencing of the content is facilitated by the information extraction process of BOEMIE.

Fig. 2. The map-based interface to multimedia content in the semantic browser

The rest of this chapter is structured as follows. Section 2 briefly presents the main modules of the prototype system that was developed in BOEMIE. Section 3 compares BOEMIE to related projects that took place either before or in parallel with it. Finally, Section 4 summarizes the main achievements of the project and proposes interesting paths for further research.

Fig. 3. The Multimedia Semantic Model. AEO (Athletics Event Ontology) models the scenario domain of interest, i.e. public athletics events. GIO (Geographic Information Ontology) models information relevant to geographic objects. MCO (Multimedia Content Ontology) models content structure descriptions, based on MPEG-7 MDS definitions. MDO (Multimedia Descriptor Ontology) models the MPEG-7 visual and audio descriptors.

2 The BOEMIE Prototype

More than 100 different modules and components were produced in the course of the BOEMIE project, some of which have been made publicly available (http://www.boemie.org/software). Most of the components that were produced have been incorporated in the integrated prototype that was delivered and evaluated at the end of the project. The BOEMIE integrated prototype implements the bootstrapping process, as illustrated in Figure 1. This sketch also shows the main components of the prototype, which are described in the remainder of this section.

2.1 Multimedia Semantic Model

The BOEMIE Multimedia Semantic Model (MSM) [12,11] integrates ontologies that capture our knowledge about a particular domain, e.g. athletics, with ontologies that model knowledge about the structure and low-level descriptors pertaining to multimedia documents (Figure 3).

Fig. 4. Linking knowledge between ontologies in the semantic model

Besides addressing the interlinking of multimedia document segments with the corresponding domain entities, MSM further enhances the engineering of subject matter descriptions by distinguishing between mid-level (MLC) and high-level (HLC) domain concepts and properties, a feature unique to the BOEMIE project. Instances of MLCs represent information that is directly extracted from the multimedia content, using the various analysis tools, e.g. the name of an athlete or her body in a picture.
On the other hand, instances of HLCs are generated through reasoning-based interpretation of the multimedia content, using the domain ontology. Such engineering allows incorporating the analysis perspective into the domain conceptualisation, which in turn supports effective logic-grounded interpretation. The developed ontologies allow the utilisation of precise formal semantics throughout the chain of tasks involved in the acquisition and deployment of multimedia content semantics.

In MSM, four OWL DL ontologies are linked in a way that supports the purposes of BOEMIE for semantics extraction, interpretation, evolution, as well as retrieval and representation of the acquired semantics. Figure 4 presents a simple example of this interlinking between the ontologies. This is also the main novelty of the BOEMIE Multimedia Semantic Model.

2.2 Recursive Media Decomposition and Fusion

The information extraction toolkit of BOEMIE integrates a number of tools for content analysis and interpretation, using a recursive media decomposition and fusion framework. In the course of the project, innovative methods for the analysis of single-modality content were developed, going in most cases beyond the state of the art. As Figure 5 illustrates, these methods cover most of the currently available types of media. Most importantly, they support the bootstrapping process through an evolving cycle of analysing new content, learning improved analysis models and discovering interesting objects and entities to add to the domain knowledge.

Fig. 5. The Information Extraction Toolkit

The coordination of the evolving extraction process is achieved by a new method that was developed in BOEMIE and is called Recursive Media Decomposition and Fusion (RMDF) [21]. The method decomposes a multimedia document into its constituent parts, including embedded text in images and speech. It then relies on single-modality modules, the results of which are fused together in a common graph that complies with the domain ontology. In a final step, graph techniques are used to provide a consistent overall analysis of the multimedia document. For instance, a Web page may be decomposed into several parts, one of which contains a video, which may in turn contain a static overlaid image that embeds text, which refers to a person. This example shows the importance of recursive decomposition and the corresponding fusion of results that come from video and text analysis.

Regarding the single-modality modules, BOEMIE has developed innovative methods to:
– detect and discover objects of various shapes and sizes in images [22],
– track moving objects in video and classify movement phases,
– identify and discover entities in text and relations amongst them [14,20],
– detect overlay and scene text in video and perform optical character recognition on it [23,1],
– recognise and discover audio events and interesting keywords in audio [4,5].

Figure 6 provides examples of such results. Most importantly, however, through interpretation and fusion, the RMDF is able to improve significantly the precision of multimedia analysis, be it in Web pages containing HTML text and images or video footage with audio commentary and overlay text. In addition to the novel decomposition and fusion approach, the single-modality tools support customization to any domain, by allowing the discovery of new semantics in content and learning to identify known objects and entities. Furthermore, the extraction toolkit is easily distributable and scalable: per-media analysis techniques can be dynamically integrated on an unrestricted number of servers that communicate through a computer network.

2.3 Abductive Multimedia Interpretation

The interpretation of multimedia content in BOEMIE goes well beyond the usual extraction of semantics from individual media.
Domain knowledge, in the form of ontologies, is exploited by a reasoning-based interpretation service that operates at two levels: single-media interpretation and fusion. The interlinking of domain and multimedia ontologies in the semantic model (Fig. 4) supports this process. Figure 7 illustrates the multi-level analysis and interpretation process. Both the single-media and the fusion services are supported by the same reasoning apparatus.

Fig. 6. Sample results of single-media analysis tools

Reasoning for multimedia interpretation is based on the RacerPro reasoning engine (http://www.racer-systems.com/products/racerpro/), which has been extended with many novel methods for the purposes of BOEMIE [3,17,16]. One of the main extensions is the use of abduction to generate interpretation hypotheses for what has been “observed” by the extraction tools. The new abductive query answering service of RacerPro is able, during query evaluation, to recognize non-entailed query atoms and hypothesize them. Since there might be more than one hypothesis (i.e. explanation), a set of scoring functions has been designed and implemented in order to prefer certain hypotheses over others. Given the complexity of the interpretation hypothesis (a.k.a. explanation) space, important optimizations have been developed in the reasoner, in order to cut down on the number of consistent and useful interpretations that are produced by the system.

Fig. 7. Multimedia analysis and interpretation process

The novel abductive multimedia interpretation machinery of BOEMIE combines Description Logics, as a representation formalism for ontologies, with DL-safe rules that guide the search for interpretations. In the context of BOEMIE, methods to learn these rules have also been developed.
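The idea of hypothesising non-entailed atoms and ranking the resulting explanations can be illustrated with a deliberately simplified sketch. It is not the RacerPro service or its API: the rules are propositional rather than DL-safe rules with variables, the concept names (PoleVault, HighJump, JavelinThrow and their evidence atoms) are hypothetical, and the scoring function (explained observations minus assumed atoms) is an assumption chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical interpretation rule: an HLC and the MLC evidence it explains."""
    head: str
    body: frozenset


def explain(observed, rules):
    """Rank candidate explanations for a set of observed MLC atoms, best first."""
    candidates = []
    for rule in rules:
        entailed = rule.body & observed        # atoms already asserted by extraction
        if not entailed:
            continue                           # the rule explains none of the observations
        hypothesised = rule.body - observed    # non-entailed atoms that must be assumed
        # Assumed scoring: reward explained observations, penalise assumptions.
        score = len(entailed) - len(hypothesised)
        candidates.append((score, rule.head, sorted(hypothesised)))
    # Highest score first; ties are broken alphabetically by concept name.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return candidates


# Illustrative rule base (names are assumptions, not taken from the BOEMIE ontologies).
RULES = [
    Rule("PoleVault", frozenset({"Person", "Pole", "HorizontalBar"})),
    Rule("HighJump", frozenset({"Person", "HorizontalBar"})),
    Rule("JavelinThrow", frozenset({"Person", "Javelin"})),
]

if __name__ == "__main__":
    observations = frozenset({"Person", "HorizontalBar"})
    for score, concept, assumed in explain(observations, RULES):
        print(score, concept, assumed)
```

With a person and a horizontal bar observed, the sketch prefers the explanation that needs no extra assumptions over the one that has to hypothesise an unseen pole, which mirrors the role the text assigns to scoring functions.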
2.4 Pattern-Based Ontology Evolution

The ontology evolution toolkit (OET) implements a pattern-based approach to the population and enrichment of the ontology, which is unique to BOEMIE [10]. In particular, two different cases have been identified for the population process: one in which a single interpretation is produced for a document and one in which more than one candidate interpretation is provided. Furthermore, two cases are defined for the enrichment process: one in which a high-level concept (HLC) is added and one in which a mid-level concept (MLC) is added. Each of those cases requires different handling in terms of the interaction with the domain expert and the modules that are employed for the semi-automated generation of new knowledge, e.g. concept enhancement, generation of relations and interpretation rules, etc. Figure 8 provides a high-level overview of these four cases (patterns P1 to P4) and the modules that are involved.

Fig. 8. Pattern-based ontology evolution

The first two cases (P1 and P2), which are responsible for the population of the ontology with new instances, are primarily based on instance matching and grouping methods [7,9]. Novel methods have been developed for this purpose, in order to take advantage of the rich semantics of the BOEMIE semantic model and scale efficiently to large document sets. These methods have been incorporated in the HMatch ontology matching software [6], which is publicly available (http://islab.dico.unimi.it/hmatch/).

A number of innovations have also been made in the area of ontology enrichment (patterns P3 and P4) [18,15]. The discovery of new concepts and properties is based on a new methodology that incorporates a set of ontology modification operators. Logical and statistical criteria are introduced for the choice of the most appropriate modifications to the ontology, given the observed data.
Further to this data-driven enrichment, a concept enhancement method has been developed, matching new constructs to knowledge in external resources, e.g. on the Web.

2.5 Interface Components

In addition to the core processing components, the BOEMIE prototype includes a number of interface components that facilitate the interaction of the users with the prototype, as well as the interaction among the components. Three of these components, which introduce a number of novel features, are the Semantic Browser, the Semantic Manager and the Bootstrapping Controller.

The BOEMIE Semantic Browser (BSB) provides an innovative interaction experience with multimedia content. It does so by supporting three modes of interaction with the multimedia content:
– Interactive maps for multimedia retrieval.
– Interactive content of media objects.
– Dynamic suggestion of related information.

A screenshot of the map interface of BSB is shown in Fig. 2. Based on the information extraction technologies of BOEMIE, BSB can associate multimedia content with geopolitical areas and specific Points of Interest (PoIs) on digital maps. In this manner it provides direct access to the multimedia, through what we call “BOEMIE PoIs”. Furthermore, BSB uses the semantic annotations generated automatically by BOEMIE to make media objects interactive. More specifically, it automatically highlights relevant content of a specific domain on top of text or images to prepare the interface for further interaction possibilities. Finally, through interpretation, BOEMIE is able to generate deeper semantic information, e.g. the type of sport that an image depicts. Using this implicit knowledge, BSB provides context-sensitive advertisement and suggests related information. This is realized through context menus, as illustrated in Fig. 9.

Fig. 9. Suggesting information related to active media objects
The BOEMIE Semantic Manager (BSM) [8] is unique in its simplification of a complex and demanding process, i.e., that of adding semantics to multimedia content and maintaining the associated domain knowledge. As an interface to the OET, BSM provides three primary functionalities:
– Population of the ontology with semantically annotated multimedia content.
– Enrichment of the ontology with new knowledge that has been learned from data.
– User-friendly interactive enhancement of new knowledge by the domain expert.

BSM provides interactive selection/approval/rejection of the multimedia content interpretations automatically produced by the BOEMIE system, as well as (similarity-based) document browsing facilities. In order to make the process accessible to users without knowledge engineering skills, it creates a natural language description of the underlying logic representation of ontology instances. Additionally, BSM provides terminological and structural suggestions to support the domain expert in performing ontology enrichment. Suggestions are dynamically extracted from knowledge chunks similar to a given concept proposal by exploiting ontology matching techniques. A repository of knowledge chunks is created and maintained through a knowledge harvesting process that periodically searches for knowledge of interest in other ontologies, web directories and, in general, external knowledge repositories. Finally, BSM incorporates a simple ontology editor [13], which uses natural language patterns and autocompletion techniques to facilitate the incorporation of new knowledge into the domain ontology. Figure 10 illustrates this editor.

Fig. 10. Concept definition and enhancement, using the semantic manager

The Bootstrapping Controller (BSC) is the main application logic component that implements the iterative extraction and evolution process.
Using the BSC, the content owner can add content to the Multimedia Repository (MMR) and then send it for processing through predefined workflows. The content is added to the repository either by uploading specific files or by crawling the Web. Typically, a new document will be sent to RMDF for extraction and the results of its interpretation will be populated into the ontology. When sufficient evidence is accumulated, the OET will generate proposals for changes to the ontology. The domain expert will use these recommendations to change the ontology and the content will be sent again for processing by the RMDF. In some cases, new mid-level concepts (MLCs) will be generated based on the analysis of the multimedia content so far. In these cases, in addition to the extension of the ontologies, the BSC will send sufficient training data to the RMDF, asking for the re-training of the analysis modules.

2.6 Manual Annotation Tools

The information extraction methods developed in BOEMIE are trainable and therefore require training material, in order to learn to identify interesting entities, objects and relations among them in multimedia content. The BOEMIE bootstrapping process generates such training data semi-automatically. However, for the purposes of training and evaluating the initial extractors, we generated significant quantities of training data for all types of media: image, video, audio and text. For these data we used interactive tools for manual annotation. Most of these tools were also developed in BOEMIE and improve significantly the state of the art in the field.

Fig. 11. Video and image annotation with the VIA tool

The VIA tool can be used for high-level and low-level video and image annotation. In both cases, annotation is aligned with concepts of the domain ontologies.
In the case of image annotation, either image regions and complete images are linked with concepts (high-level annotation) or visual descriptors are extracted per annotated region and associated with the corresponding concept (low-level annotation). To reduce the manual annotation burden, VIA (http://mklab.iti.gr/project/via) supports the automatic segmentation of a still image into regions and region-merging. Regarding video annotation, VIA supports input in MPEG1/2 video format and frame-accurate video playback and navigation. Video annotation can take place either in a frame-by-frame style or as live annotation during playback. Figure 11 illustrates the use of VIA.

The text and HTML annotation tool BTAT (http://www.boemie.org/btat) [19] has been developed over the Ellogon open-source text engineering platform (http://www.ellogon.org/). It supports the annotation of named entities, the mid-level concepts (MLCs), as well as relations between those named entities. The relations are grouped in tables of specific types. Tables correspond to high-level concepts (HLCs). Furthermore, the tool enables the annotation of relations between HLC instances by creating links between tables in an effective and easy way. One of the innovations in BTAT is its dual manual and automated annotation functionality. Manual annotation is facilitated by a smart text-marking system, where a mouse click selects whole words rather than single characters. Automatic annotation works by matching user-defined regular expressions. Figure 12 illustrates the use of BTAT.

Fig. 12. Text and HTML annotation with the BTAT tool
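The two annotation modes described for BTAT can be approximated in a few lines. The sketch below is not BTAT or Ellogon code: the function names, the label names (Performance, City) and the sample sentence are hypothetical, and it only assumes that automatic annotation means labelling every match of a user-supplied regular expression, while manual marking snaps a click position to the enclosing word.

```python
import re


def annotate(text, patterns):
    """Label every match of each user-defined pattern.

    Returns (start, end, label, surface) tuples ordered by position.
    """
    spans = []
    for label, pattern in patterns.items():
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), label, match.group(0)))
    return sorted(spans)


def snap_to_word(text, offset):
    """Expand a click at a character offset to the enclosing word.

    Returns the (start, end) of the word, or None if the click
    falls on whitespace or punctuation.
    """
    for match in re.finditer(r"\w+", text):
        if match.start() <= offset < match.end():
            return match.start(), match.end()
    return None


if __name__ == "__main__":
    sample = "The athlete cleared 5.05 m in Beijing."
    user_patterns = {"Performance": r"\d+\.\d+ m", "City": r"Beijing"}
    for start, end, label, surface in annotate(sample, user_patterns):
        print(label, repr(surface), start, end)
    print(snap_to_word(sample, sample.index("athlete") + 2))
```

Returning character offsets alongside the label keeps the annotations independent of any display layer, so the same spans could be used for highlighting or for exporting training data.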

Similar resources

Boemie: Bootstrapping Ontology Evolution with Multimedia Information Extraction

The BOEMIE project proposes a bootstrapping approach to knowledge acquisition, which uses multimedia ontologies for fused extraction of semantics from multiple modalities, and feeds back the extracted information, aiming to automate the ontology evolution process.


Bootstrapping an Ontology-based Information Extraction System

Automatic intelligent web exploration will benefit from shallow information extraction techniques if the latter can be brought to work within many different domains. The major bottleneck for this, however, lies in the so far difficult and expensive modeling of lexical knowledge, extraction rules, and an ontology that together define the information extraction system. In this paper we present a ...


Ontology-Based Information Extraction under a Bootstrapping Approach

The authors present an ontology-based information extraction process, which operates in a bootstrapping framework. The novelty of this approach lies in the continuous semantics extraction from textual content in order to evolve the underlying ontology, while the evolved ontology enhances in turn the information extraction mechanism. This process was implemented in the context of the R&D project...


On the Need to Bootstrap Ontology Learning with Extraction Grammar Learning

The main claim of this paper is that machine learning can help integrate the construction of ontologies and extraction grammars and lead us closer to the Semantic Web vision. The proposed approach is a bootstrapping process that combines ontology and grammar learning, in order to semi-automate the knowledge acquisition process. After providing a survey of the most relevant work towards this goa...


Bootstrapping Biomedical Ontologies for Scientific Text using NELL

We describe an open information extraction system for biomedical text based on NELL (the Never-Ending Language Learner) (Carlson et al., 2010), a system designed for extraction from Web text. NELL uses a coupled semi-supervised bootstrapping approach to learn new facts from text, given an initial ontology and a small number of “seeds” for each ontology category. In contrast to previous applicat...



Publication date: 2011